Add a dedicated OpenAI-compatible LLM adapter #1895
Conversation
Note (Reviews paused): It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the review settings.
**Walkthrough**

This PR adds support for OpenAI-compatible LLM providers (e.g., via LiteLLM) by introducing a new adapter type, parameter validation with model normalization, improved usage token handling, and corresponding tests and configuration.

**Changes:** OpenAI-Compatible LLM Adapter Feature
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (1 warning, 1 inconclusive)
for more information, see https://pre-commit.ci
| Filename | Overview |
|---|---|
| unstract/sdk1/src/unstract/sdk1/adapters/base1.py | Adds OpenAICompatibleLLMParameters; validate() still mutates the caller's dict and model is absent from the JSON schema required array — both already flagged in open review threads. |
| unstract/sdk1/src/unstract/sdk1/llm.py | _record_usage now prefers reported prompt_tokens and degrades gracefully on token_counter failure; warning message text matches the test assertion correctly. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py | Clean adapter implementation using SCHEMA_PATH class var, proper MRO (parameters first, then BaseAdapter), and distinct provider/icon values. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/openai_compatible.json | Schema is well-formed; model field is still absent from the required array (flagged in previous thread). |
| unstract/sdk1/tests/test_openai_compatible_adapter.py | Tests cover registration, model normalisation, schema loading, and usage recording; magic stub and schema assertion concerns from previous threads are now addressed. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py | Registers OpenAICompatibleLLMAdapter correctly alongside existing adapters. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant UI as UI / Caller
    participant LLM as LLM.__init__
    participant Adapter as OpenAICompatibleLLMAdapter
    participant Params as OpenAICompatibleLLMParameters
    participant LiteLLM as LiteLLM
    UI->>LLM: LLM(adapter_id, adapter_metadata)
    LLM->>Adapter: validate(adapter_metadata)
    Adapter->>Params: validate_model(adapter_metadata)
    Params-->>Adapter: "custom_openai/{model}"
    Adapter->>Params: "OpenAICompatibleLLMParameters(**metadata).model_dump()"
    Adapter-->>LLM: kwargs (model, api_base, api_key, ...)
    LLM-->>UI: LLM instance
    UI->>LLM: complete(prompt)
    LLM->>Adapter: "validate({**self.kwargs, **kwargs})"
    Adapter-->>LLM: completion_kwargs
    LLM->>LiteLLM: "litellm.completion(messages, **completion_kwargs)"
    LiteLLM-->>LLM: response (with usage)
    LLM->>LLM: _record_usage(model, messages, usage, complete)
    alt usage has prompt_tokens
        LLM->>LLM: use reported prompt_tokens directly
    else prompt_tokens absent or zero
        LLM->>LiteLLM: token_counter(model, messages)
        alt token_counter succeeds
            LiteLLM-->>LLM: estimated prompt_tokens
        else token_counter raises
            LLM->>LLM: log warning, record 0 prompt_tokens
        end
    end
    LLM->>LiteLLM: cost_per_token(model, prompt_tokens, completion_tokens)
    LiteLLM-->>LLM: (prompt_cost, compl_cost)
    LLM->>LLM: append to _pending_usage
    LLM-->>UI: response LLMResponseCompat
```
Reviews (8) — last reviewed commit: "Reject empty model string in OpenAICompa..."
Actionable comments posted: 1
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)
`542-557`: **Avoid unconditional token estimation when usage already includes prompt tokens.**

This currently computes `token_counter()` even when provider usage already has prompt tokens, which can create repeated warnings/noise for unmapped models without improving recorded usage.

♻️ Proposed refinement

```diff
-        try:
-            prompt_tokens = token_counter(model=model, messages=messages)
-        except Exception as e:
-            prompt_tokens = 0
-            logger.warning(
-                "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
-                model,
-                llm_api,
-                e,
-            )
         usage_data: Mapping[str, int] = usage or {}
+        prompt_tokens = usage_data.get("prompt_tokens")
+        if prompt_tokens is None:
+            try:
+                prompt_tokens = token_counter(model=model, messages=messages)
+            except Exception as e:
+                prompt_tokens = 0
+                logger.warning(
+                    "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
+                    model,
+                    llm_api,
+                    e,
+                )
         all_tokens = TokenCounterCompat(
-            prompt_tokens=usage_data.get("prompt_tokens", 0),
+            prompt_tokens=usage_data.get("prompt_tokens", prompt_tokens or 0),
             completion_tokens=usage_data.get("completion_tokens", 0),
             total_tokens=usage_data.get("total_tokens", 0),
         )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 542 - 557, The code unconditionally calls token_counter(model, messages) even when usage already contains prompt token counts; change the logic in the block around token_counter and TokenCounterCompat so you first check usage (usage_data = usage or {}) and if usage_data.get("prompt_tokens") is present use that value for prompt_tokens instead of calling token_counter; only call token_counter(model, messages) inside the try/except when usage_data lacks prompt_tokens, preserving the existing exception handling and the logger.warning path, and then construct TokenCounterCompat using the values from usage_data (falling back to the estimated prompt_tokens when used).
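The refinement above boils down to a small decision function. A minimal sketch of the preference order, with the `resolve_prompt_tokens` helper name and its `estimate` callback being hypothetical stand-ins (the real code calls `litellm.token_counter` inline); this version uses an explicit None-check so a provider-reported 0 is kept as-is:

```python
import logging
from typing import Mapping, Optional, Callable

logger = logging.getLogger(__name__)

def resolve_prompt_tokens(
    usage: Optional[Mapping[str, int]],
    estimate: Callable[[], int],
) -> int:
    """Prefer provider-reported prompt_tokens; estimate only when absent."""
    usage_data = usage or {}
    prompt_tokens = usage_data.get("prompt_tokens")
    if prompt_tokens is not None:
        # Provider already reported a count (even 0); skip the estimator entirely.
        return prompt_tokens
    try:
        return estimate()
    except Exception as e:
        # Degrade gracefully: record 0 rather than failing the billing step.
        logger.warning("Failed to estimate prompt tokens: %s", e)
        return 0
```

With this shape, the estimator is never invoked (and so never warns) when the provider's usage payload already carries a prompt-token count.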
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json`:
- Around line 15-20: The schema for the "api_key" property currently only allows
a string which fails when runtime metadata contains null; update the "api_key"
entry in the JSON schema (the "api_key" property in custom_openai.json) to
permit null values by changing its type to accept both string and null (or add a
nullable:true equivalent) so stored configs with null pass validation and
editing flows.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 542-557: The code unconditionally calls token_counter(model,
messages) even when usage already contains prompt token counts; change the logic
in the block around token_counter and TokenCounterCompat so you first check
usage (usage_data = usage or {}) and if usage_data.get("prompt_tokens") is
present use that value for prompt_tokens instead of calling token_counter; only
call token_counter(model, messages) inside the try/except when usage_data lacks
prompt_tokens, preserving the existing exception handling and the logger.warning
path, and then construct TokenCounterCompat using the values from usage_data
(falling back to the estimated prompt_tokens when used).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bf841637-54b7-4802-9156-7f56e899ca54
📒 Files selected for processing (7)
- README.md
- unstract/sdk1/src/unstract/sdk1/adapters/base1.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
- unstract/sdk1/src/unstract/sdk1/llm.py
- unstract/sdk1/tests/test_openai_compatible_adapter.py
Addressed the review follow-ups.
Validation re-run:
Gentle follow-up on this PR in case it slipped through the queue. When someone has bandwidth, I would really appreciate a review. Happy to make any follow-up changes quickly. Thanks!
jaseemjaskp
left a comment
PR Review Toolkit — consolidated findings
Automated review aggregating six specialist agents (code-reviewer, code-simplifier, silent-failure-hunter, type-design-analyzer, pr-test-analyzer, comment-analyzer). No blocking defects; the adapter follows existing sibling-adapter conventions and the scope is appropriately narrow.
High-signal items worth addressing before merge
- `llm.py:556` — redundant/confusing fallback: `prompt_tokens` is already resolved at line 543 with an explicit estimation branch, so the `usage_data.get("prompt_tokens", prompt_tokens or 0)` expression double-handles the default and, worse, silently coerces an explicit `None` from `usage_data` to 0 without logging.
- `llm.py:547` — broad `except Exception` combined with a warning that says "failed to estimate" but not "recording 0 tokens" means billing/audit can silently under-report. Narrow the exception and either use `logger.exception` or rewrite the message to name the consequence.
- `base1.py:232` / schema `api_base` — the Pydantic type is plain `str`; URL shape lives only in the JSON schema, so direct construction accepts garbage. Consider `HttpUrl` / a `field_validator`.
- `base1.py:239` — prefix logic is only invoked by the `validate` classmethod. `AzureOpenAILLMParameters` uses `@model_validator(mode="before")`, which cannot be bypassed; mirroring that tightens the invariant.
- `custom_openai.json` — `api_base` is listed as required yet ships with a placeholder default URL, which lets users save an unchanged form and hit 404s at request time instead of validation errors. Vendor-specific examples (`ERNIE-4.0-8K` from Baidu Qianfan, `qianfan.baidubce.com`) are prone to rot and should be generic.
- `test_openai_compatible_adapter.py` — the "tolerates unmapped models" test only asserts `push_usage_data.assert_called_once()`; it does not verify `prompt_tokens=0` was actually pushed, so a regression silently pushing `None` or crashing inside `TokenCounterCompat` would still pass.
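For the `api_base` item, the suggested fix is Pydantic's `HttpUrl` or a `field_validator`. A stdlib-only sketch of the shape-check such a validator could delegate to (the `check_api_base` function name is hypothetical, not from the SDK):

```python
from urllib.parse import urlparse

def check_api_base(value: str) -> str:
    """Reject values that are not plausible http(s) base URLs."""
    candidate = value.strip()
    parsed = urlparse(candidate)
    # urlparse("gateway.example.com") yields an empty scheme and netloc,
    # so bare hostnames and garbage strings fail this check.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"api_base must be an http(s) URL, got {value!r}")
    return candidate
```

Wiring the same check into a Pydantic `field_validator("api_base")` would make direct model construction fail fast instead of deferring the error to request time.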
Suggested follow-ups (non-blocking)
- Add tests for: `api_base` missing (Pydantic ValidationError), `usage=None`, `usage["prompt_tokens"]=None`, and the success branch of `token_counter`.
- Drop the unused `@lru_cache` on `_load_llm_module` and the dead `_load_llm_class`.
- Replace the tautological `get_description()` / metadata description with user-facing copy that distinguishes this adapter from `OpenAILLMAdapter`.
Inline comments below flag each item at its exact line with a concrete fix suggestion.
Actionable comments posted: 1
🧹 Nitpick comments (5)
unstract/sdk1/tests/test_openai_compatible_adapter.py (3)
`102-192`: **LGTM — good coverage of the three `_record_usage` branches.**

Tests exercise: (a) provider-supplied `prompt_tokens` bypasses `token_counter`, (b) `token_counter` raising falls back to 0 with a warning, and (c) `prompt_tokens=None` triggers estimation. The use of `__new__` to bypass `__init__` and the targeted `patch.object(llm_module, ...)` are appropriate here. Nice to see the warning-message assertion on line 162 pinning the audit-visible text.

One small suggestion: assert `mock_warning` is called exactly once and with `model`/`llm_api` substituted into the format string, to catch regressions that accidentally change the log signature.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 102 - 192, Update the test_record_usage_tolerates_unmapped_models_without_prompt_tokens test to assert the warning logger was called exactly once and that the warning message includes the model ("custom_openai/gateway-model") and llm_api ("complete") values; locate the test function and the mock_warning (patched via patch.object(llm_module.logger, "warning")) and after calling llm._record_usage add assertions that mock_warning.assert_called_once() and that the call_args contains both the model and llm_api strings in the formatted warning message to catch signature regressions.
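The suggested assertion shape can be demonstrated on a toy logger call. Here `record_usage_warn` is a stand-in for the real `_record_usage` warning path, reusing the log format quoted in the review; the key point is asserting both the call count and the substituted arguments:

```python
import logging
from unittest.mock import patch

logger = logging.getLogger("sdk1.llm")

def record_usage_warn(model: str, llm_api: str, err: str) -> None:
    # Same lazy %-style call shape as the SDK's warning, so the
    # model/llm_api values appear as separate positional args.
    logger.warning(
        "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
        model, llm_api, err,
    )

with patch.object(logger, "warning") as mock_warning:
    record_usage_warn("custom_openai/gateway-model", "complete", "boom")

mock_warning.assert_called_once()              # exactly one warning emitted
args = mock_warning.call_args.args
assert "custom_openai/gateway-model" in args   # model substituted into the call
assert "complete" in args                      # llm_api substituted into the call
```

Because the call uses lazy %-formatting, the model and API names are individual positional arguments, so the assertions survive cosmetic changes to the message text while still pinning the log signature.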
`18-31`: **Minor: `lru_cache` around `import_module` is largely redundant.**

`sys.modules` already caches modules after first import, so the `lru_cache(maxsize=1)` only saves the `patch.dict` context-manager overhead. Leaving it is harmless, but on second and subsequent calls the `magic` stub will not be re-installed (because the cached branch returns early), so any test that newly triggers `import magic` after the first call would see the real module. Not a problem today, but worth documenting with a short comment to prevent surprise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 18 - 31, The `@lru_cache` on _load_llm_module() prevents the patch.dict stub for "magic" from being re-applied on subsequent calls, which can lead to surprising behavior if tests later import the real magic module; either remove the `@lru_cache` decorator or (preferred) keep it but add a brief comment inside _load_llm_module explaining that sys.modules already caches imports and that the cached result means the "magic" stub will not be re-installed on later calls so tests should call this once or manage stubbing themselves — reference the _load_llm_module function and the patch.dict usage when adding the comment.
`34-35`: **Dead helper: `_load_llm_class` is never called.**

Every test uses `_load_llm_module().LLM` directly (e.g., lines 104, 135, 167). Consider removing `_load_llm_class` or using it in place of the inline `llm_module.LLM` lookups for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 34 - 35, The helper function _load_llm_class is unused (dead code); either remove _load_llm_class entirely or replace direct usages of _load_llm_module().LLM in tests (e.g., the inline lookups at places that call llm_module.LLM) with calls to _load_llm_class() for consistency. Locate the definition of _load_llm_class and the test files referencing _load_llm_module().LLM and either delete the unused _load_llm_class function or update those tests to call _load_llm_class() instead, ensuring imports and type annotations still match.

unstract/sdk1/src/unstract/sdk1/adapters/base1.py (1)
`234-242`: **Minor: `validate()` mutates the caller's dict.**

Lines 236 and 241 write back into `adapter_metadata` (same pattern as `OpenAILLMParameters`, but unlike `VertexAILLMParameters`, which copies first via `{**adapter_metadata}`). Given `LLM.complete()` calls `self.adapter.validate({**self.kwargs, **kwargs})` (a fresh dict), there's no current bug, but if a future caller passes a long-lived dict, the `api_key` would be mutated in place and the model prefix would double-rewrite on a second call (the `startswith("custom_openai/")` guard in `validate_model` mitigates the latter).

Optional: copy first for defensive hygiene, matching the `VertexAILLMParameters.validate()` style.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py` around lines 234 - 242, The validate() method currently mutates the incoming adapter_metadata dict (it writes to adapter_metadata["model"] and adapter_metadata["api_key"]); to avoid in-place side effects make a shallow copy first (e.g., metadata = {**adapter_metadata}) and perform all modifications against that copy before passing it to OpenAICompatibleLLMParameters.validate_model and constructing OpenAICompatibleLLMParameters(**metadata).model_dump(); keep references to the same symbols (validate, OpenAICompatibleLLMParameters.validate_model, OpenAICompatibleLLMParameters, adapter_metadata) so the change is local and preserves existing behavior while preventing caller dict mutation.

unstract/sdk1/src/unstract/sdk1/llm.py (1)
`557`: **Minor: `prompt_tokens or 0` also zeroes out a legitimate `0` from the provider.**

If a provider ever reports `usage.prompt_tokens == 0` (unusual, but possible for zero-content requests or certain gateways), the truthiness check collapses it the same as `None`. Given the `prompt_tokens is None` branch already assigns an int (or 0 on exception), this `or 0` is only needed to satisfy the type checker. A more precise form:

Proposed tweak

```diff
-        all_tokens = TokenCounterCompat(
-            prompt_tokens=prompt_tokens or 0,
+        all_tokens = TokenCounterCompat(
+            prompt_tokens=prompt_tokens if prompt_tokens is not None else 0,
             completion_tokens=usage_data.get("completion_tokens", 0),
             total_tokens=usage_data.get("total_tokens", 0),
         )
```

Low-impact since 0 vs None ends up the same in the audit row, but semantically cleaner.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` at line 557, Replace the truthiness fallback that zeroes out legitimate zero values: instead of using "prompt_tokens=prompt_tokens or 0" keep the explicit None-check so only None becomes 0 (e.g., use a conditional expression that assigns prompt_tokens if prompt_tokens is not None else 0). Locate the occurrence of the "prompt_tokens=prompt_tokens or 0" assignment and change it to an explicit None-check for the variable prompt_tokens so a reported 0 remains 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 543-560: The current catch in the prompt token estimation around
token_counter (used when building TokenCounterCompat) silently sets
prompt_tokens=0; update this to (1) narrow the except to only expected errors
from the estimator (e.g., KeyError/ValueError and litellm-specific exceptions
raised by token_counter) so unexpected errors still propagate, and (2) add a
sentinel field to the usage payload (e.g., prompt_tokens_source or
estimation_failed) before calling Audit().push_usage_data to mark that prompt
tokens were estimated/failed, and/or increment an ops metric/counter when the
fallback path occurs; reference the token_counter call, TokenCounterCompat
construction, Audit().push_usage_data, and the existing logger to emit a clear
warning and metric.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py`:
- Around line 234-242: The validate() method currently mutates the incoming
adapter_metadata dict (it writes to adapter_metadata["model"] and
adapter_metadata["api_key"]); to avoid in-place side effects make a shallow copy
first (e.g., metadata = {**adapter_metadata}) and perform all modifications
against that copy before passing it to
OpenAICompatibleLLMParameters.validate_model and constructing
OpenAICompatibleLLMParameters(**metadata).model_dump(); keep references to the
same symbols (validate, OpenAICompatibleLLMParameters.validate_model,
OpenAICompatibleLLMParameters, adapter_metadata) so the change is local and
preserves existing behavior while preventing caller dict mutation.
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Line 557: Replace the truthiness fallback that zeroes out legitimate zero
values: instead of using "prompt_tokens=prompt_tokens or 0" keep the explicit
None-check so only None becomes 0 (e.g., use a conditional expression that
assigns prompt_tokens if prompt_tokens is not None else 0). Locate the
occurrence of the "prompt_tokens=prompt_tokens or 0" assignment and change it to
an explicit None-check for the variable prompt_tokens so a reported 0 remains 0.
In `@unstract/sdk1/tests/test_openai_compatible_adapter.py`:
- Around line 102-192: Update the
test_record_usage_tolerates_unmapped_models_without_prompt_tokens test to assert
the warning logger was called exactly once and that the warning message includes
the model ("custom_openai/gateway-model") and llm_api ("complete") values;
locate the test function and the mock_warning (patched via
patch.object(llm_module.logger, "warning")) and after calling llm._record_usage
add assertions that mock_warning.assert_called_once() and that the call_args
contains both the model and llm_api strings in the formatted warning message to
catch signature regressions.
- Around line 18-31: The `@lru_cache` on _load_llm_module() prevents the
patch.dict stub for "magic" from being re-applied on subsequent calls, which can
lead to surprising behavior if tests later import the real magic module; either
remove the `@lru_cache` decorator or (preferred) keep it but add a brief comment
inside _load_llm_module explaining that sys.modules already caches imports and
that the cached result means the "magic" stub will not be re-installed on later
calls so tests should call this once or manage stubbing themselves — reference
the _load_llm_module function and the patch.dict usage when adding the comment.
- Around line 34-35: The helper function _load_llm_class is unused (dead code);
either remove _load_llm_class entirely or replace direct usages of
_load_llm_module().LLM in tests (e.g., the inline lookups at places that call
llm_module.LLM) with calls to _load_llm_class() for consistency. Locate the
definition of _load_llm_class and the test files referencing
_load_llm_module().LLM and either delete the unused _load_llm_class function or
update those tests to call _load_llm_class() instead, ensuring imports and type
annotations still match.
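The copy-first `validate()` pattern suggested for `base1.py` can be sketched standalone. This is a simplified illustration covering only the model-prefix rewrite, not the SDK's exact classmethod; the point is that the caller's dict survives untouched and the prefix guard keeps repeat calls idempotent:

```python
def validate(adapter_metadata: dict) -> dict:
    """Defensive variant: rewrite a shallow copy so the caller's dict is untouched."""
    metadata = {**adapter_metadata}  # copy first, VertexAI-style
    model = metadata.get("model", "")
    if not model.startswith("custom_openai/"):
        # Apply the LiteLLM provider prefix only once.
        metadata["model"] = f"custom_openai/{model}"
    return metadata
```

A long-lived config dict passed here twice would previously have been mutated in place; with the copy, each call returns a fresh normalized dict and the original stays reusable.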
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bb92694d-b745-40e9-8bd9-0bc1fa3628b1
⛔ Files ignored due to path filters (1)
`frontend/public/icons/adapter-icons/OpenAICompatible.png` is excluded by `!**/*.png`
📒 Files selected for processing (5)
- unstract/sdk1/src/unstract/sdk1/adapters/base1.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
- unstract/sdk1/src/unstract/sdk1/llm.py
- unstract/sdk1/tests/test_openai_compatible_adapter.py
✅ Files skipped from review due to trivial changes (2)
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
Hi @jimmyzhuu - sorry we let this sit so long without a review, that's on us. The change looks reasonable in scope and the validation looks solid. If you're still interested, we'd be happy to have you reopen it and we'll get a maintainer on it this week. Either way, thanks for the careful work and the patience.
Absolutely — I'd be happy to work on this. Thanks for the suggestion!
chandrasekharan-zipstack
left a comment
LGTM for the most part, @pk-zipstack please help take a look as well
jaseemjaskp
left a comment
LGTM. After running the PR review toolkit (code-reviewer, comment-analyzer, pr-test-analyzer, silent-failure-hunter, type-design-analyzer, code-simplifier), all clearly important findings (correctness, billing/audit data integrity, broken contracts) were already raised by prior reviewers (greptile, coderabbitai, jaseemjaskp) and addressed by the author in the latest commits. Remaining items from agent passes are minor or NIT-level (e.g. tightening api_base/model pydantic validators, schema title/filename naming, end-to-end LLM(init) coverage), so I'm not posting them as inline comments. Resolved my three previously-posted threads that the current code addresses (DESCRIPTION constant, dedicated icon, blank api_key coercion).
@jimmyzhuu The one thing I'd note is that the exception-swallowing path now applies to every adapter, not just custom_openai. I think that's the right tradeoff (a successful LLM call shouldn't fail at the billing step), but flagging it for the merge commit.
@athul-rs Updated the PR description. |
Resolve conflict in unstract/sdk1/src/unstract/sdk1/llm.py by taking main's version of _record_usage. This PR's prompt_tokens fallback is subsumed by main's existing litellm.token_counter() fallback, and the TokenCounterCompat / Audit().push_usage_data() shape this PR retained is dead post-Lookups-V2 refactor (usage now flows through self._pending_usage). Update the three _record_usage tests in test_openai_compatible_adapter.py to assert on the _pending_usage carrier rather than the removed Audit pattern. Also fix a latent sys.modules rollback in _load_llm_module that broke litellm attribute resolution after the patch.dict context exited. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
validate_model previously produced "custom_openai/" for an empty model, surfacing as a confusing LiteLLM error at call time. Match the existing GeminiLLMParameters.validate_model pattern: strip whitespace, raise ValueError on empty input. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
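The commit message above describes the fix in prose; a standalone sketch of the described normalization (mirroring the `GeminiLLMParameters.validate_model` pattern it cites, but written as a plain function rather than the SDK's exact classmethod):

```python
def validate_model(model: str) -> str:
    """Strip whitespace, reject empty input, and apply the LiteLLM provider prefix."""
    model = (model or "").strip()
    if not model:
        # Fail at validation time instead of surfacing a confusing
        # "custom_openai/" model error from LiteLLM at call time.
        raise ValueError("model must be a non-empty string")
    if model.startswith("custom_openai/"):
        return model  # already normalized; avoid double-prefixing
    return f"custom_openai/{model}"
```

The guard on the existing prefix also makes repeated validation idempotent, which matters because `validate()` runs again on every `complete()` call.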
```python
class BaseAdapter(ABC):
    """Adapter base class for compatibility with all SDK v1 providers."""

    SCHEMA_PATH: ClassVar[str | None] = None
```
@jimmyzhuu @pk-zipstack may be a silly question. Why we had to refactor SCHEMA_PATH as a class variable? I was just wondering if using like this could have some other side effect since we use BaseAdapter as baseclass for other adapter classes?
I mean, wouldn't renaming the openai_compatible.json to custom_openai.json have avoided these changes altogether?
@jimmyzhuu If the above works, I would suggest doing that and reverting the introduced logic for cls.SCHEMA_PATH. While there's nothing inherently wrong with what you did, we want to keep changes to the base class minimal, and avoid them entirely when possible.
@ritwik-g's suggestion is correct and is the simpler path.
Looking at the code, the default `get_json_schema()` in `BaseAdapter` constructs the schema path as:

```python
f"{os.path.dirname(__file__)}/{cls.get_adapter_type().name.lower()}1/static/{cls.get_provider()}.json"
```

Since `OpenAICompatibleLLMAdapter.get_provider()` returns `"custom_openai"`, the default path resolution would look for `llm1/static/custom_openai.json`. The schema file is named `openai_compatible.json` instead, which is the only reason `SCHEMA_PATH` was added to `BaseAdapter`.
Every other adapter in the static directory follows the convention of naming the JSON file after get_provider(): anthropic.json, bedrock.json, vertex_ai.json, ollama.json, etc. openai_compatible.json is the only outlier.
The fix is straightforward:
- Rename `openai_compatible.json` → `custom_openai.json`
- Remove `SCHEMA_PATH` from `BaseAdapter`
- Remove the `SCHEMA_PATH` override in `OpenAICompatibleLLMAdapter`
This keeps the base class untouched and aligns with the naming convention already established by all other adapters.
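The convention-based resolution quoted above can be restated as a plain function. The `default_schema_path` name is hypothetical; the f-string mirrors the one in `BaseAdapter.get_json_schema()`:

```python
import os

def default_schema_path(module_file: str, adapter_type_name: str, provider: str) -> str:
    """Rebuilds BaseAdapter's convention: <module dir>/<type>1/static/<provider>.json."""
    base_dir = os.path.dirname(module_file)
    return f"{base_dir}/{adapter_type_name.lower()}1/static/{provider}.json"
```

With `provider == "custom_openai"`, this default only finds a file named `custom_openai.json`, which is exactly why the rename removes the need for any `SCHEMA_PATH` override.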
🧩 Analysis chain
@ritwik-g Sure! I'll plan these three changes now.
- Rename `openai_compatible.json` → `custom_openai.json` in the static directory
- Remove `SCHEMA_PATH: ClassVar[str | None] = None` from `BaseAdapter`
- Remove the `SCHEMA_PATH` override and its `get_json_schema` usage from `OpenAICompatibleLLMAdapter`
🏁 Script executed:
```bash
#!/bin/bash
# Find the schema JSON file
fd "openai_compatible.json" --type f
# Find OpenAICompatibleLLMAdapter implementation
fd "openai_compatible.py" --type f
# Check for SCHEMA_PATH usage in openai_compatible.py
rg -n "SCHEMA_PATH" --type py
```
Length of output: 3009
🏁 Script executed:
```bash
#!/bin/bash
# Find the schema JSON file location
fd "openai_compatible.json" --type f
# Find OpenAICompatibleLLMAdapter implementation
fd "openai_compatible.py" --type f
# Check for SCHEMA_PATH usage across the codebase
rg -n "SCHEMA_PATH" --type py -A2 -B2
```
Length of output: 10596
@ritwik-g Here's the full plan for the fix. Four files need to change:
1. Rename `openai_compatible.json` → `custom_openai.json` so `get_json_schema()` auto-resolves via `get_provider() == "custom_openai"`
2. Remove the `SCHEMA_PATH` class variable and the `if schema_path is None` branch from `BaseAdapter` in `base1.py`; also drop the now-unused `ClassVar` import
3. Remove the `SCHEMA_PATH` override (and unused `os`/`ClassVar` imports) from `OpenAICompatibleLLMAdapter`
4. Update the test that directly accesses `SCHEMA_PATH` and checks the old filename
🧬 Code Graph Analysis Results
-
File: unstract/sdk1/src/unstract/sdk1/constants.py
Lines: 4-14class Common: METADATA = "metadata" MODULE = "module" ADAPTER = "adapter" SRC_FOLDER = "src" ADAPTER_METADATA = "adapter_metadata" ICON = "icon" ADAPTER_ID = "adapter_id" ADAPTER_TYPE = "adapter_type" ADAPTER_NAME = "_adapter_name" DEFAULT_ERR_MESSAGE = "Something went wrong" -
File: unstract/sdk1/src/unstract/sdk1/constants.py
Lines: 17-23class AdapterTypes(Enum): UNKNOWN = "UNKNOWN" LLM = "LLM" EMBEDDING = "EMBEDDING" VECTOR_DB = "VECTOR_DB" OCR = "OCR" X2TEXT = "X2TEXT" -
File: unstract/sdk1/src/unstract/sdk1/adapters/vectordb/register.py
Lines: 19-30def register_adapters(adapters: dict[str, Any]) -> None: current_directory = os.path.dirname(os.path.abspath(__file__)) package = "unstract.sdk1.adapters.vectordb" for adapter in os.listdir(current_directory): adapter_path = os.path.join(current_directory, adapter, Common.SRC_FOLDER) # Check if the item is a directory and not a # special directory like __pycache__ if os.path.isdir(adapter_path) and not adapter.startswith("__"): VectorDBRegistry._build_adapter_list(adapter, package, adapters) if len(adapters) == 0: logger.warning("No vectorDB adapter found.") -
File: unstract/sdk1/src/unstract/sdk1/adapters/ocr/register.py
Lines: 19-30def register_adapters(adapters: dict[str, Any]) -> None: current_directory = os.path.dirname(os.path.abspath(__file__)) package = "unstract.sdk1.adapters.ocr" for adapter in os.listdir(current_directory): adapter_path = os.path.join(adapter_path??, adapter, Common.SRC_FOLDER) # Check if the item is a directory and not a # special directory like __pycache__ if os.path.isdir(adapter_path) and not adapter.startswith("__"): OCRRegistry._build_adapter_list(adapter, package, adapters) if len(adapters) == 0: logger.warning("No ocr adapter found.") -
File: unstract/sdk1/src/unstract/sdk1/adapters/x2text/register.py
Lines: 19-30def register_adapters(adapters: dict[str, Any]) -> None: current_directory = os.path.dirname(os.path.abspath(__file__)) package = "unstract.sdk1.adapters.x2text" for adapter in os.listdir(current_directory): adapter_path = os.path.join(current_directory, adapter, Common.SRC_FOLDER) # Check if the item is a directory and not a # special directory like __pycache__ if os.path.isdir(adapter_path) and not adapter.startswith("__"): X2TextRegistry._build_adapter_list(adapter, package, adapters) if len(adapters) == 0: logger.warning("No X2Text adapter found.") -
File: unstract/sdk1/src/unstract/sdk1/adapters/embedding1/gemini.py
Lines: 9-40def get_id() -> str: return "gemini|5c2a36b8-0b8e-4f26-82c0-9f3b564cb066" def get_name() -> str: return "Gemini" def get_description() -> str: return "Gemini embedding adapter" def get_provider() -> str: return "gemini" def get_icon() -> str: return "/icons/adapter-icons/Gemini.png" def get_adapter_type() -> AdapterTypes: return AdapterTypes.EMBEDDING -
File: unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
Lines: 20-51def get_id() -> str: return "openaicompatible|b6d10f33-2c41-49fc-a8c2-58d2b247fc09" def get_name() -> str: return "OpenAI Compatible" def get_description() -> str: return DESCRIPTION def get_provider() -> str: return "custom_openai" def get_icon() -> str: return "/icons/adapter-icons/OpenAICompatible.png" def get_adapter_type() -> AdapterTypes: return AdapterTypes.LLM -
File: unstract/sdk1/src/unstract/sdk1/adapters/ocr/google_document_ai/src/google_document_ai.py
Lines: 51-64

```python
def get_id() -> str:
    return "googledocumentai|1013f64b-ecc9-4e35-b986-aebd60fb55d7"

def get_name() -> str:
    return "GoogleDocumentAI"

def get_description() -> str:
    return "Google Document AI OCR"

def get_icon() -> str:
    return "/icons/adapter-icons/GoogleDocumentAI.png"
```
File: unstract/sdk1/src/unstract/sdk1/adapters/x2text/llama_parse/src/llama_parse.py
Lines: 31-40

```python
def get_id() -> str:
    return "llamaparse|78860239-b3cc-4cc5-b3de-f84315f75d14"

def get_name() -> str:
    return "LlamaParse"

def get_description() -> str:
    return "LlamaParse X2Text"

def get_icon() -> str:
    return "/icons/adapter-icons/llama-parse.png"
```
File: unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer/src/llm_whisperer.py
Lines: 49-62

```python
def get_id() -> str:
    return "llmwhisperer|0a1647f0-f65f-410d-843b-3d979c78350e"

def get_name() -> str:
    return "LLMWhisperer"

def get_description() -> str:
    return "LLMWhisperer X2Text"

def get_icon() -> str:
    return "/icons/adapter-icons/LLMWhisperer.png"
```
File: unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/llm_whisperer_v2.py
Lines: 42-55

```python
def get_id() -> str:
    return "llmwhisperer|a5e6b8af-3e1f-4a80-b006-d017e8e67f93"

def get_name() -> str:
    return "LLMWhisperer V2"

def get_description() -> str:
    return "LLMWhisperer V2 X2Text"

def get_icon() -> str:
    return "/icons/adapter-icons/LLMWhispererV2.png"
```
File: unstract/sdk1/src/unstract/sdk1/adapters/vectordb/pinecone/src/pinecone.py
Lines: 49-62

```python
def get_id() -> str:
    return "pinecone|83881133-485d-4ecc-b1f7-0009f96dc74a"

def get_name() -> str:
    return "Pinecone"

def get_description() -> str:
    return "Pinecone VectorDB"

def get_icon() -> str:
    return "/icons/adapter-icons/pinecone.png"
```
File: unstract/sdk1/src/unstract/sdk1/vector_db.py
Lines: 192-202

```python
def add(
    self,
    ref_doc_id: str,
    nodes: list[BaseNode],
) -> list[str]:
    if not self.vector_db_adapter_class:
        raise VectorDBError("Vector DB is not initialised properly")
    self.vector_db_adapter_class.add(
        ref_doc_id=ref_doc_id,
        nodes=nodes,
    )
```
Summary
This PR adds a dedicated `OpenAI Compatible` LLM adapter for OpenAI-style chat completion endpoints that are not the official OpenAI service. The implementation is intentionally small in scope:

- `OpenAI Compatible` LLM adapter backed by LiteLLM's `custom_openai` path
- Existing `OpenAI` adapter unchanged

Why
Users may already have access to OpenAI-compatible endpoints behind a private gateway or third-party provider, but the current `OpenAI` adapter is specifically shaped around official OpenAI semantics. Using a separate adapter keeps those semantics explicit and avoids broadening the meaning of the existing `OpenAI` adapter.

Refs #1894
Refs #856
Refs #1443
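For illustration, the settings such an adapter typically needs are the usual OpenAI-compatible connection fields. The dict below is a hypothetical sketch — the field names and values are assumptions for a private gateway, not the adapter's actual JSON schema:

```python
# Hypothetical settings for an OpenAI-compatible endpoint behind a gateway.
settings = {
    # Model name exactly as the gateway exposes it (assumed example).
    "model": "llama-3.1-70b-instruct",
    # Non-OpenAI base URL of the compatible endpoint (assumed example).
    "api_base": "https://llm-gateway.example.com/v1",
    # Key issued by the gateway, not an OpenAI key (placeholder).
    "api_key": "sk-placeholder",
    "timeout": 600,
    "max_retries": 3,
}
print(settings["api_base"])
```

The point is that `model` and `api_base` together identify the target, which is why the existing `OpenAI` adapter's assumptions do not carry over cleanly.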
Scope
This PR is limited to:
- No changes to existing `OpenAI` adapter behavior

Notes
`LLM._record_usage` now prefers provider-reported `prompt_tokens` when they are present in the usage payload. If `prompt_tokens` are missing, `_record_usage` still falls back to LiteLLM token estimation. If that estimation raises, it now logs a warning and records `0` prompt tokens for usage audit instead of bubbling the exception up after a successful LLM call.

This behavior change is in the shared `_record_usage` path, so it applies to every SDK1 LLM adapter that uses it, not just `custom_openai`. This keeps successful LLM calls from failing at the usage-audit / billing step while preserving the current pricing semantics in this PR.
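The fallback order described above can be sketched as a standalone helper. This is an illustration only — the real `_record_usage` signature and LiteLLM's `token_counter` call differ, and the function name here is an assumption:

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def resolve_prompt_tokens(
    usage: dict[str, Any],
    estimate: Callable[[], int],
) -> int:
    """Prefer provider-reported prompt_tokens; fall back to estimation.

    If the estimator itself raises, log a warning and record 0 so a
    successful LLM call is not failed at the usage-audit step.
    """
    reported = usage.get("prompt_tokens")
    if reported is not None:
        return int(reported)
    try:
        return estimate()
    except Exception:
        logger.warning("Token estimation failed; recording 0 prompt tokens")
        return 0


def boom() -> int:
    raise ValueError("no tokenizer available for this model")


print(resolve_prompt_tokens({"prompt_tokens": 42}, boom))  # 42 (reported wins)
print(resolve_prompt_tokens({}, lambda: 17))               # 17 (estimated)
print(resolve_prompt_tokens({}, boom))                     # 0 (graceful fallback)
```

Ordering matters: the reported value short-circuits before any estimation runs, so a provider that returns usage never pays the tokenizer cost or risks its failure mode.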
Validation
- `UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run pytest tests/test_openai_compatible_adapter.py`
- `UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run ruff check src/unstract/sdk1/adapters/base1.py src/unstract/sdk1/adapters/llm1/__init__.py src/unstract/sdk1/adapters/llm1/openai_compatible.py src/unstract/sdk1/llm.py tests/test_openai_compatible_adapter.py`